REAL PARTY IN INTEREST



The real party in interest in the present Appeal is International Business Machines 
Corporation (IBM), the Assignee of the present Application, as evidenced by the Assignment set 
forth and recorded at Reel 9864, Frame 0552. 

RELATED APPEALS AND INTERFERENCES 

There are no other appeals or interferences known to Appellant, the Appellant's legal 
representative, or assignee, which directly affect or would be directly affected by or have a 
bearing on the Board's decision in the pending appeal. 

STATUS OF CLAIMS 

Claims 1, 3-6, and 13-25, which comprise all pending claims, stand finally rejected as 
noted in the Examiner's Office Action dated July 15, 2002. Claims 2 and 7-12 have been 
cancelled. The rejection of each pending claim is appealed. 

STATUS OF AMENDMENTS 

No Amendment to the Claims has been submitted subsequent to the Final Rejection. 

SUMMARY OF INVENTION 

As set forth at Page 7, line 3, et seq., of the present Specification, the present invention 
is directed to a technique for reducing the number of attributes of a sample population employed 
in generating a predictive model based on the sample population. Reducing the number of 
attributes of a sample population reduces the amount of computational resources required for 
predictive modeling and improves the accuracy of the resulting predictive model by removing 
samples that are not strongly related to the target population, which could otherwise skew the 
results. 

The technique of the present invention may best be understood with reference to Figure 3, 
which illustrates a flow chart for practicing the present invention. The process begins at step 302, 
which depicts a model build being initiated. A data set, from which a sample population may be 
drawn including at least one sample having a desired attribute, should be available for building 
the desired predictive model. If less than the entire data set is employed in generating the 
predictive model, the resulting predictive model may then be applied to the remaining samples in 
the data set. The desired attribute(s) for which the predictive model is generated need not have 
only two possible values, but may be a relative measure such as a value exceeding a 
predetermined threshold. 

The process first passes to step 304, which illustrates grouping the elements of the sample 
population based on the values of the attribute(s) to be the subject of prediction, identifying a 
target group of samples. The process then passes to step 306, which depicts selecting an attribute 
and determining a relative difference or divergence in the attribute values for target group 
samples versus the whole sample population. A relative difference (e.g., ratio or percentage) 
should be determined since comparison of absolute differences may not be meaningful. 
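
By way of illustration only, steps 304 and 306 might be sketched as follows. The sketch assumes numeric attributes and takes the relative difference to be the gap between the target-group mean and the overall mean, divided by the overall mean; that particular measure, the function names, and the toy data are assumptions made for this example and are not drawn from the Specification.

```python
from statistics import mean

def split_target(samples, predicted_attr, is_desired):
    """Step 304 (sketch): keep the samples whose predicted attribute is 'desired'.

    is_desired is a predicate, so the desired value may be an exact match
    or a relative measure such as exceeding a threshold.
    """
    return [s for s in samples if is_desired(s[predicted_attr])]

def relative_difference(samples, target, attr):
    """Step 306 (sketch): relative gap between target-group and overall means."""
    overall = mean(s[attr] for s in samples)
    in_target = mean(s[attr] for s in target)
    if overall == 0:
        # Assumption of this sketch: a zero overall mean is treated as an error.
        raise ValueError("overall mean is zero; relative difference undefined")
    return abs(in_target - overall) / abs(overall)

# Hypothetical sample population; "bought" is the attribute to be predicted.
samples = [
    {"age": 20, "income": 30, "bought": 1},
    {"age": 22, "income": 90, "bought": 1},
    {"age": 60, "income": 40, "bought": 0},
    {"age": 58, "income": 80, "bought": 0},
]
target = split_target(samples, "bought", lambda v: v == 1)

age_diff = relative_difference(samples, target, "age")        # 19 / 40
income_diff = relative_difference(samples, target, "income")  # 0 / 60
```

In this toy population the target group diverges from the whole population on age but not on income, so age would be the stronger candidate for retention.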

The process then passes to step 308, which illustrates a determination of whether all 
attributes available for the sample population, other than those for which the predictive model 
is being built, have been considered. If all attributes for the sample population have not been 
considered, the process returns to step 306 to select another attribute for analysis and repeat the 
process of step 306 with the newly selected attribute. 

Docket No. AT9-99-037 

Once all attributes for the sample population have been analyzed, the process proceeds 
from step 308 to step 310, which depicts selecting n attributes exhibiting the largest relative 
differences for samples having the desired attributes as compared to all samples within the 
sample population. A sort or ranking of the attributes by such relative difference may be useful 
in this step. The number n of attributes selected may be any arbitrarily set number or, as 
described above, may be a predetermined percentage of the attributes or attributes exhibiting a 
relative difference between samples which exceeds a predetermined threshold. 
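
The three selection criteria mentioned for step 310 (a fixed number n, a predetermined percentage, or a predetermined threshold) might be sketched as follows. The function names, the rounding rule used for the percentage, and the example scores are hypothetical and serve only to illustrate the ranking described above.

```python
import math

def top_n(scores, n):
    """Return the n attributes with the largest relative difference."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]

def top_percentage(scores, pct):
    """Return the top pct percent of attributes (rounded up, at least one).

    Rounding up is an assumption of this sketch.
    """
    n = max(1, math.ceil(len(scores) * pct / 100))
    return top_n(scores, n)

def above_threshold(scores, threshold):
    """Return attributes whose relative difference exceeds the threshold."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [attr for attr in ranked if scores[attr] > threshold]

# Hypothetical relative differences, one per candidate attribute.
scores = {"age": 0.475, "income": 0.0, "zip_code": 0.12, "tenure": 0.30}

by_count = top_n(scores, 2)
by_percentage = top_percentage(scores, 25)
by_threshold = above_threshold(scores, 0.1)
```

Each criterion works from the same ranking of attributes by relative difference; only the cut-off rule differs.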

The process next passes to step 312, which illustrates building a model for the desired 
attribute and the sample population utilizing the selected attributes. Various known techniques 
may be employed for this purpose. The process passes then to step 314, which depicts applying 
the predictive model generated to a data set. Finally, the process passes to step 316, which 
illustrates the process becoming idle until another model build is undertaken. 

The present invention allows data collections having large numbers of potentially 
irrelevant or meaningless attributes for each sample to be employed in building an accurate 
predictive model. Efficiency in generating the predictive model is improved by reducing the 
number of attributes which are considered during the model build. This requires both less time 
and fewer computational resources to generate the predictive model. Accuracy of the resulting 
predictive model is also improved. Attributes which might skew the sample population but have 
no relation to the desired characteristic (or less relation to the desired attribute than other 
attributes) are eliminated from consideration in building the predictive model. 

ISSUE ON APPEAL 

Is the Examiner's rejection of Claims 1, 3-6, and 13-25 under 35 U.S.C. § 103(a) as being 
unpatentable over Piatetsky-Shapiro (Shapiro), "Discovery, Analysis, and Presentation of Strong 
Rules," AAAI/MIT Press 1991, in view of Simoudis, et al., U.S. Patent No. 5,692,107, and in 
further view of Dash, et al., "Dimensionality Reduction of Unsupervised Data," IEEE 1997, well 
founded? 

GROUPING OF CLAIMS 



For purposes of this Appeal, Claims 1, 3-6, and 13-25 stand or fall together as a single 
group. 

ARGUMENT 

The Examiner has rejected Claims 1, 3-6, and 13-25 under 35 U.S.C. § 103(a) as being 
unpatentable over Piatetsky-Shapiro (Shapiro), "Discovery, Analysis, and Presentation of Strong 
Rules," in "Knowledge Discovery in Databases," AAAI/MIT Press, 1991, in view of Simoudis, 
et al., U.S. Patent No. 5,692,107, and in further view of Dash, et al., "Dimensionality Reduction 
of Unsupervised Data," Proceedings, Ninth IEEE International Conference on Tools with 
Artificial Intelligence, Nov. 1997. Appellants contend such rejection is not well founded and 
should be reversed. 

The Examiner relies upon Shapiro to teach the claim limitation of "comparing said one 
or more desired attributes and respective values with said sample population to obtain a target 
population." First, Appellants point out that the reference is devoid of any teaching of obtaining 
a target population. The "target population" is an important claim element because the statistical 
measure of difference between the attributes and respective values of the "target population" and 
those of the sample population is what is used in "reducing the number of attributes and 
respective values of the sample population." 

The Examiner believes, as indicated in the Advisory Action mailed on 09/05/2002, that 
"Pietetsky-Shapiro expressly teaches obtaining a target population in the last two lines of 
page 235" which recite: 

At the end, a cell for A = a contains the summary of all the file tuples satisfying 
A = a. The summary can be presented to the user or used for deriving rules 
implied by A = a. 

Appellants contend the above lines indicate only that the KID3 algorithm taught by Shapiro 
produces a summary of the sample population; they do not teach obtaining a target population. 
Support for Appellants' interpretation may be found at page 235 of Shapiro, which recites: "I 
present here the KID3 algorithm that finds, in parallel, all simple exact rules of the form (A = a) 
--> cond(Bi)" and "... the cell summary is updated ..." Appellants contend that a summary of a 
sample population is not a target population, and have included the definitions of "population," 
"sample population," and "target population" found on the Portland State University website. 

Second, Shapiro does not teach or suggest determining a statistical measure of difference 
between each of the attributes and respective values of the target population and sample 
population as recited in Claim 1. In the claimed invention, the selected target population is 
compared to the entire sample population to determine which attributes and respective values are 
most likely relevant in computing a predictive model. The comparison of a target population to 
the sample population yields different results than simply reducing a data set to a set of rules as 
in Shapiro. The results depend on the selected target group and not the population as a whole. 
Different target groups may result in a different selection of most relevant attributes. For 
example, a target group for the purchase of a type of pizza may show a strong correlation with 
age and no other attribute while the target group for the purchase of an expensive product may 
show a correlation with income. 
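
The dependence on the selected target group can be illustrated with a small, purely hypothetical data set. The sketch below assumes numeric attributes and a mean-based relative difference; the attribute names, the data, and the measure itself are assumptions chosen for illustration and are not taken from the Specification or the cited references.

```python
from statistics import mean

def rank_attributes(samples, target_flag, candidate_attrs):
    """Rank candidate attributes by how far the target group departs
    from the whole sample population (mean-based relative difference)."""
    target = [s for s in samples if s[target_flag]]
    scores = {}
    for attr in candidate_attrs:
        overall = mean(s[attr] for s in samples)  # assumed non-zero here
        in_target = mean(s[attr] for s in target)
        scores[attr] = abs(in_target - overall) / abs(overall)
    return sorted(candidate_attrs, key=scores.get, reverse=True)

# Hypothetical population: pizza buyers are young; luxury buyers earn more.
samples = [
    {"age": 20, "income": 40, "pizza": True,  "luxury": False},
    {"age": 22, "income": 80, "pizza": True,  "luxury": True},
    {"age": 58, "income": 40, "pizza": False, "luxury": False},
    {"age": 60, "income": 80, "pizza": False, "luxury": True},
]

pizza_ranking = rank_attributes(samples, "pizza", ["age", "income"])
luxury_ranking = rank_attributes(samples, "luxury", ["age", "income"])
```

With the same sample population, the pizza target group ranks age first while the luxury target group ranks income first, which is the behavior described in the example above.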

The Examiner asserts that Simoudis teaches the selection of a data analysis module to 
perform data mining, including the use of a target population. Appellants acknowledge that 
Simoudis teaches the use of a target population that is employed in generating a predictive model. 
However, Simoudis does not teach "comparing said one or more desired attributes and respective 
values with said sample population to obtain a target population" as recited by the claims of the 
present invention. Simoudis teaches only that the target data set typically represents a subset of 
a larger underlying data source and may be compiled from sources with different data formats 
(Col. 4, lines 16-17). The present invention teaches a technique, not found in the prior art, for 
selecting a target group by comparing attribute values of the sample population to desired 
values and reducing the number of attributes by determining the statistical measure of difference 
between the attributes of the target and sample populations. 

In rejecting Claim 3, the Examiner relies upon Dash to teach using entropy as a statistical 
measure. The Examiner does not appear to use Dash in rejecting Claim 1. Dash teaches 
dimensionality reduction of unsupervised data and an entropy measure. Dash is silent regarding 
the reduction of variables based on a difference between the attributes and respective values of 
a target group and a sample population. 

For a rejection under § 103(a) to be well founded, the Examiner must present prior art that 
teaches or suggests every limitation of the claim(s) rejected. The combination of Shapiro, 
Simoudis, and Dash does not teach or suggest every claim limitation of the present invention. Most 
notably, the cited prior art lacks any teaching of determining a statistical measure of difference 
between the attributes and respective values of a target population and a sample population or 
comparing attributes and respective values with a sample population to obtain a target 
population. Accordingly, Appellants contend the rejection under § 103(a) is not well founded and 
should be reversed. 

CONCLUSION 

In light of the above arguments, Appellants contend the claimed invention is not taught 
or suggested by the art relied upon by the Examiner. Consequently, Appellants urge that the 
rejection is not well founded and should be reversed. 

Please charge IBM Corporation Deposit Account No. 09-0447 in the amount of $320.00 
for submission of a Brief in Support of an Appeal. No additional fees or expenses are believed 
to be required; however, if any additional fees are required, please charge IBM Corporation 
Deposit Account No. 09-0447. 

Respectfully submitted, 



Brian P. Klawinski 

Reg. No. 51,087 

Bracewell & Patterson, L.L.P. 

P.O. Box 969 

Austin, Texas 78767-0969 

(512) 472-7800 

(512) 472-9123 Facsimile 

ATTORNEY FOR APPLICANT 

APPENDIX 



1. (Amended) A method of reducing the number of attributes and respective values of a sample 
population employed in generating a predictive model, said method comprising the steps of: 
    obtaining one or more desired attributes and respective values; 
    comparing said one or more desired attributes and respective values with said sample 
    population to obtain a target population; 
    determining a statistical measure of difference between each of the attributes and 
    respective values of said target population and the attributes and respective values of the 
    sample population; and 
    utilizing said statistical measure of difference to reduce the number of attributes and 
    respective values of said sample population. 

2. (Cancelled) 

3. (Amended) The method of claim 1, wherein the step of determining a statistical measure of 
difference further comprises: 
    determining an entropy for the attribute values. 

4. (Amended) The method of claim 1, wherein the step of utilizing said statistical measure to 
reduce the number of attributes and respective values of said population further comprises: 
    identifying n attributes having a largest difference in respective values with said target 
    population. 

5. (Amended) The method of claim 1, wherein the step of utilizing said statistical measure to 
reduce the number of attributes and respective values of said population further comprises: 
    identifying a predetermined percentage of attributes and respective values having a larger 
    statistical measure of difference than remaining attributes and respective values. 

6. (Amended) The method of claim 1, wherein the step of utilizing said statistical measure to 
reduce the number of attributes and respective values of said population further comprises: 
    identifying attributes and respective values where said statistical measure of difference 
    exceeds a predetermined amount. 



7. (Cancelled) 
8. (Cancelled) 
9. (Cancelled) 
10. (Cancelled) 
11. (Cancelled) 
12. (Cancelled) 



13. (Amended) A method of selecting attributes for computing a model, comprising: 
    for a plurality of samples each having values for a plurality of attributes: 
        for each of the plurality of attributes: 
            comparing the attribute values for a target group of samples to the 
            attribute values for all of the plurality of samples; and 
            determining a difference between the attribute values for the target groups 
            and the attribute values for all of the plurality of samples; and 
        identifying attributes within the plurality of attributes having a largest 
        difference between the attribute values for the target groups and the attribute 
        values for all of the plurality of samples; and 
    selecting at least some of the identified attributes. 



14. (Amended) A system for selecting attributes for computing a model, comprising: 
    a memory containing data for a plurality of samples each having values for a plurality of 
    attributes; and 
    a processor coupled to the memory and executing a selection process including: 
        comparing attribute values for samples having a desired attribute value to attribute 
        values for all samples; 
        selecting a subset of available attributes based on a difference between attribute 
        values for the samples having the desired attribute value and attribute values for all 
        of the samples; and 
        employing the selected subset of attributes to generate a predictive model. 

15. (Unchanged) The system of claim 14, wherein the selection process determines a statistical 
measure of difference between the attribute values for samples having the desired attribute and 
the attribute values for all of the samples. 

16. (Unchanged) The system of claim 15, wherein the selection process determines an entropy 
for the attribute values. 

17. (Unchanged) The system of claim 14, wherein the selection process identifies a 
predetermined number of attributes having a largest difference in the attribute values for 
selection. 

18. (Unchanged) The system of claim 14, wherein the selection process identifies a 
predetermined percentage of attributes having a larger difference in the attribute values for 
selection. 

19. (Unchanged) The system of claim 14, wherein the selection process identifies, for selection, 
attributes having a difference in the attribute values exceeding a predetermined amount. 

20. (Amended) A system for computing a model, comprising: 
    a memory containing data for a plurality of samples each having values for a plurality of 
    attributes; and 
    a processor coupled to the memory and executing a selection process including: 
        comparing attribute values for a target subset of the plurality of samples to 
        attribute values for all of the samples; 
        selecting attributes having a largest difference between attribute values for the 
        target subset and attribute values for all of the samples; and 
        computing a model employing the selected attributes. 

21. (Amended) A computer usable medium for selecting attributes for computing a model, said 
computer usable medium comprising: 
    computer program code for reading values of attributes for a plurality of samples; 
    computer program code for comparing attribute values for samples having a desired 
    attribute value to attribute values for all samples; and 
    computer program code for selecting a subset of available attributes based on a difference 
    between attribute values for samples having the desired attribute value and attribute values 
    for all samples. 

22. (Amended) The computer usable medium of claim 21, wherein the instructions for 
comparing attribute values for samples having a desired attribute value to attribute values for all 
samples further comprise: 
    computer program code for determining a statistical measure of difference between the 
    attribute values for samples having the desired attribute value and the attribute values for 
    all samples. 

23. (Amended) The computer usable medium of claim 22, wherein the instructions for 
determining a statistical measure of difference between the attribute values for samples having 
the desired attribute value and the attribute values for all samples further comprise: 
    computer program code for determining an entropy of the attribute values for samples 
    having the desired attribute value and an entropy of the attribute values for all samples; 
    computer program code for comparing the entropy of the attribute values for samples 
    having the desired attribute value to the entropy of the attribute values for all samples for 
    each attribute to determine a relative measure of difference; and 
    computer program code for comparing the relative measure of difference of all attributes. 



24. (Amended) The computer usable medium of claim 21, wherein the instructions for 
selecting a subset of available attributes based on a difference between attribute values for 
samples having the desired attribute value and attribute values for all samples further comprise: 
    computer program code for identifying n attributes having a largest difference in the 
    attribute values. 

25. (Amended) A computer usable medium for selecting attributes for computing a model, said 
computer usable medium comprising: 
    computer program code for comparing attribute values for a target group of samples to 
    attribute values for all samples for each of a plurality of attributes; 
    computer program code for determining a difference between the attribute values for the 
    target group of samples and the attribute values for all of the samples; and 
    computer program code for selecting a group of attributes having a largest difference 
    between the attribute values for the target group of samples and the attribute values for all 
    samples. 